Co-changing code volume prediction through association rule mining and linear regression model

Authors

  • Shin-Jie Lee
  • Li Hsiang Lo
  • Yu-Cheng Chen
  • Shi-Min Shen
Abstract

Code smells are symptoms in source code that indicate possible deeper problems and may serve as drivers for refactoring. Although effort has gone into identifying divergent changes and shotgun surgeries, little emphasis has been put on predicting the volume of co-changing code that appears in these code smells. More specifically, when a software developer intends to perform a particular modification task on a method, a predicted volume of code that will potentially be co-changed with the method is significant information for estimating the modification effort. In this paper, we propose an approach to predicting the volume of co-changing code affected by a method to be modified. The approach has two key features: (1) co-changing methods can be identified, for detecting divergent changes and shotgun surgeries, based on association rules mined from change histories; and (2) the volume of co-changing code affected by a method to be modified can be predicted through a fitted regression line, derived with a t-test, based on the results of co-changing method identification. The experimental results show that the success rate of co-changing method identification is 82% with a suggested threshold, and that the number of correct identifications is not affected by the growing number of commits as a project continuously evolves. Additionally, the mean absolute error of the co-changing code volume predictions is 133 lines of code, which is 95.3% lower than that of a naive approach. © 2015 Elsevier Ltd. All rights reserved.
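The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the rule format, the thresholds, or the regression predictor, so everything below is an assumption. The sketch mines pairwise rules "A changed, so B changed" from commits represented as sets of method names, keeps rules that meet hypothetical `min_support` and `min_confidence` thresholds, and fits an ordinary least-squares line that maps an assumed predictor (the number of identified co-changing methods) to co-changed lines of code. The t-test on the fitted line is omitted, and all names and sample values are made up for illustration.

```python
from collections import Counter
from itertools import permutations


def mine_cochange_rules(commits, min_support=2, min_confidence=0.5):
    """Mine pairwise rules a -> b from commits (each a set of changed methods).

    Support is the number of commits changing both a and b; confidence is
    that count divided by the number of commits changing a. Both thresholds
    are illustrative assumptions, not values from the paper.
    """
    single = Counter()
    pair = Counter()
    for changed in commits:
        methods = set(changed)
        single.update(methods)
        # Ordered pairs, so a -> b and b -> a are counted separately.
        pair.update(permutations(sorted(methods), 2))
    rules = {}
    for (a, b), n_ab in pair.items():
        confidence = n_ab / single[a]
        if n_ab >= min_support and confidence >= min_confidence:
            rules.setdefault(a, {})[b] = confidence
    return rules


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy change history (hypothetical method names).
commits = [
    {"parse", "validate"},
    {"parse", "validate", "render"},
    {"parse", "validate"},
    {"render", "log"},
]
rules = mine_cochange_rules(commits)

# Toy training data: number of identified co-changing methods (assumed
# predictor) against observed co-changed lines of code.
slope, intercept = fit_line([1, 2, 3, 4], [30, 50, 70, 90])

# Predicted co-changing code volume for a method to be modified.
predicted_loc = slope * len(rules.get("parse", {})) + intercept
```

With this toy history, only the rules between `parse` and `validate` pass both thresholds, so the prediction for `parse` uses one co-changing method. In practice the thresholds would need tuning against the change history, in the spirit of the "suggested threshold" the abstract mentions.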

Similar resources

Prediction of Blasting Cost in Limestone Mines Using Gene Expression Programming Model and Artificial Neural Networks

The use of blasting cost (BC) prediction to achieve optimal fragmentation is necessary in order to control the adverse consequences of blasting such as fly rock, ground vibration, and air blast in open-pit mines. In this research work, BC is predicted through collecting 146 blasting data from six limestone mines in Iran using the artificial neural networks (ANNs), gene expression programming (G...

Numeric Multi-Objective Rule Mining Using Simulated Annealing Algorithm

Abstract as a single objective one. Measures like support, confidence and other interestingness criteria which are used for evaluating a rule, can be thought of as different objectives of association rule mining problem. Support count is the number of records, which satisfies all the conditions that exist in the rule. This objective represents the accuracy of the rules extracted from the da...

Studying Co-evolution of Production & Test Code Using Association Rule Mining

Unit tests are generally acknowledged as an important aid to produce high quality code, as they provide quick feedback to developers on the correctness of their code. In order to achieve high quality, well-maintained tests are needed. Ideally, tests co-evolve with the production code to test changes as soon as possible. In this paper, we explore an approach to determine whether production and t...

Prediction of ultimate strength of shale using artificial neural network

A rock failure criterion is very important for predicting ultimate strength in rock mechanics and geotechnics; it is determined for rock mechanics studies in mining, civil, and oil wellbore drilling operations. Shales are also among the most difficult formations to treat. Therefore, in this research work, using the artificial neural network (ANN), a model was built to predict the ultimat...

Development of a site-specific regression model for assessment of road-header cutting performance of Tabas coal mine based on rock properties

In underground excavation, where the road-headers are employed, a precise prediction of the road-header performance has a vital role in the economy of the project. In this paper, a new model is developed for prediction of the road-header performance using the non-linear multivariate regression analysis. This model is able to estimate the instantaneous cutting rate (ICR) of roadheader based on r...

Journal title:
  • Expert Syst. Appl.

Volume 45

Publication date 2016